Goto

Collaborating Authors

 Dumfries and Galloway



UK lacks plan to defend itself from invasion, MPs warn

BBC News

The UK lacks a plan to defend itself from military attack, a committee of MPs has warned. In a highly critical report, the defence committee says the UK is over-reliant on US resources and that preparations to defend itself and overseas territories in the event of attack are nowhere near where they need to be. The committee's chair, Labour MP Tan Dhesi, said: Putin's brutal invasion of Ukraine, unrelenting disinformation campaigns, and repeated incursions into European airspace mean that we cannot afford to bury our heads in the sand. It comes as the Ministry of Defence (MoD) identified parts of the country where six or more new munitions factories could be built. In June, Defence Secretary John Healey announced plans to move the UK to war-fighting readiness, including £1.5bn to support the construction of new munitions factories, which will be built by private contractors.



LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data

Yang, Cehao, Lin, Xueyuan, Xu, Chengjin, Jiang, Xuhui, Ma, Shengjie, Liu, Aofan, Xiong, Hui, Guo, Jian

arXiv.org Artificial Intelligence

Despite the growing development of long-context large language models (LLMs), data-centric approaches relying on synthetic data have been hindered by issues related to faithfulness, which limit their effectiveness in enhancing model performance on tasks such as long-context reasoning and question answering (QA). These challenges are often exacerbated by misinformation caused by lack of verification, reasoning without attribution, and potential knowledge conflicts. We propose LongFaith, a novel pipeline for synthesizing faithful long-context reasoning instruction datasets. By integrating ground truth and citation-based reasoning prompts, we eliminate distractions and improve the accuracy of reasoning chains, thus mitigating the need for costly verification processes. We open-source two synthesized datasets, LongFaith-SFT and LongFaith-PO, which systematically address multiple dimensions of faithfulness, including verified reasoning, attribution, and contextual grounding. Extensive experiments on multi-hop reasoning datasets and LongBench demonstrate that models fine-tuned on these datasets significantly improve performance. Our ablation studies highlight the scalability and adaptability of the LongFaith pipeline, showcasing its broad applicability in developing long-context LLMs.


RAG-Reward: Optimizing RAG with Reward Modeling and RLHF

Zhang, Hanning, Song, Juntong, Zhu, Juno, Wu, Yuanhao, Zhang, Tong, Niu, Cheng

arXiv.org Artificial Intelligence

Retrieval-augmented generation (RAG) enhances Large Language Models (LLMs) with relevant and up-to-date knowledge, improving their ability to answer knowledge-intensive questions. It has been shown to enhance both generation quality and trustworthiness. While numerous works have focused on improving retrieval, generation, and evaluation, the role of reward models in reinforcement learning for optimizing RAG remains underexplored. In this paper, we introduce \textbf{RAG-Reward}, a framework designed to develop reward models to enable \textit{hallucination-free, comprehensive, reliable, and efficient RAG}. We define four key metrics to assess generation quality and develop an automated benchmarking pipeline to evaluate the outputs of multiple LLMs across a variety of RAG scenarios. Using \textbf{RAG-Reward}, we train reward models and apply {reinforcement learning with human feedback (RLHF)} to improve LLMs' effectiveness in RAG. Experimental results demonstrate that our reward model achieves state-of-the-art performance in automatic benchmarking and aligns closely with human evaluations. Furthermore, the improved generation quality of the trained policy model highlights the feasibility and efficiency of using RLHF to enhance RAG outputs.


Man guilty of army veteran hammer attack murder

BBC News

Man guilty of army veteran hammer attack murder Cumbria PoliceJack Crawley attempted to burn Paul Taylor's body, before burying him in woodland A man who attacked an army veteran he had met for sex and bludgeoned him with a hammer has been found guilty of murder. Paul Taylor, 57, from Annan, Dumfriesshire, went missing last October, with his remains found in a shallow grave in woodland near Carlisle, Cumbria, in May. Jack Crawley, 20, of Carlisle, was found guilty of attacking him and trying to burn his body following a trial at the city's crown court. He will be sentenced on Wednesday. Crawley was also found guilty of the attempted murder of a man in York, who he met on the gay dating app Grindr and also attacked with a hammer, while he was on bail for killing Mr Taylor.


LLM Evaluators Recognize and Favor Their Own Generations

Panickssery, Arjun, Bowman, Samuel R., Feng, Shi

arXiv.org Artificial Intelligence

Self-evaluation using large language models (LLMs) has proven valuable not only in benchmarking but also methods like reward modeling, constitutional AI, and self-refinement. But new biases are introduced due to the same LLM acting as both the evaluator and the evaluatee. One such bias is self-preference, where an LLM evaluator scores its own outputs higher than others' while human annotators consider them of equal quality. But do LLMs actually recognize their own outputs when they give those texts higher scores, or is it just a coincidence? In this paper, we investigate if self-recognition capability contributes to self-preference. We discover that, out of the box, LLMs such as GPT-4 and Llama 2 have non-trivial accuracy at distinguishing themselves from other LLMs and humans. By fine-tuning LLMs, we discover a linear correlation between self-recognition capability and the strength of self-preference bias; using controlled experiments, we show that the causal explanation resists straightforward confounders. We discuss how self-recognition can interfere with unbiased evaluations and AI safety more generally.


On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization

Flores, Lorenzo Jaime Yu, Cohan, Arman

arXiv.org Artificial Intelligence

Text summarization and simplification are among the most widely used applications of AI. However, models developed for such tasks are often prone to hallucination, which can result from training on unaligned data. One efficient approach to address this issue is Loss Truncation (LT) (Kang and Hashimoto, 2020), an approach to modify the standard log loss to adaptively remove noisy examples during training. However, we find that LT alone yields a considerable number of hallucinated entities on various datasets. We study the behavior of the underlying losses between factual and non-factual examples, to understand and refine the performance of LT. We demonstrate that LT's performance is limited when the underlying assumption that noisy targets have higher NLL loss is not satisfied, and find that word-level NLL among entities provides better signal for distinguishing factuality. We then leverage this to propose a fine-grained NLL loss and fine-grained data cleaning strategies, and observe improvements in hallucination reduction across some datasets. Our work is available at https://https://github.com/yale-nlp/fine-grained-lt.


How to Discern Important Urgent News?

Vasilyev, Oleg, Bohannon, John

arXiv.org Artificial Intelligence

We found that a simple property of clusters in a clustered dataset of news correlate strongly with importance and urgency of news (IUN) as assessed by LLM. We verified our finding across different news datasets, dataset sizes, clustering algorithms and embeddings. The found correlation should allow using clustering (as an alternative to LLM) for identifying the most important urgent news, or for filtering out unimportant articles.


Forcing Generative Models to Degenerate Ones: The Power of Data Poisoning Attacks

Jiang, Shuli, Kadhe, Swanand Ravindra, Zhou, Yi, Cai, Ling, Baracaldo, Nathalie

arXiv.org Artificial Intelligence

Growing applications of large language models (LLMs) trained by a third party raise serious concerns on the security vulnerability of LLMs. It has been demonstrated that malicious actors can covertly exploit these vulnerabilities in LLMs through poisoning attacks aimed at generating undesirable outputs. While poisoning attacks have received significant attention in the image domain (e.g., object detection), and classification tasks, their implications for generative models, particularly in the realm of natural language generation (NLG) tasks, remain poorly understood. To bridge this gap, we perform a comprehensive exploration of various poisoning techniques to assess their effectiveness across a range of generative tasks. Furthermore, we introduce a range of metrics designed to quantify the success and stealthiness of poisoning attacks specifically tailored to NLG tasks. Through extensive experiments on multiple NLG tasks, LLMs and datasets, we show that it is possible to successfully poison an LLM during the fine-tuning stage using as little as 1% of the total tuning data samples. Our paper presents the first systematic approach to comprehend poisoning attacks targeting NLG tasks considering a wide range of triggers and attack settings. We hope our findings will assist the AI security community in devising appropriate defenses against such threats.